Concept Drift Visualization ⋆ Yuan Yao
Authors
Abstract
Mining data streams faces many challenges, one of which is concept drift. In many practical applications, concept drift degrades classification performance on a data stream, or even causes the classifier to fail. However, most proposed methods address concept drift from the data-value point of view, and very little attention has been paid to mining knowledge at the concept level. Motivated by this, in this paper we use Kullback-Leibler divergence (KL-divergence) to detect concept drift dynamically. We also construct a concept pool to store the distinct concepts in a data stream and to analyze the concept transformation information. Experimental studies on two real-world data sets demonstrate that the proposed concept visualization method and concept transformation map can effectively and efficiently mine the relationships among concept drifts from noisy streaming data.
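The abstract does not say how the stream is windowed, how distributions are estimated, or what threshold triggers a drift, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes a one-dimensional numeric stream, fixed-size non-overlapping windows, smoothed histograms as the "concept" representation, and a hypothetical `threshold` on the KL-divergence $D_{KL}(P \| Q) = \sum_i p_i \log(p_i / q_i)$. All function and parameter names (`histogram`, `track_concepts`, `window_size`, `bins`) are illustrative.

```python
import math


def histogram(window, bins=10, lo=0.0, hi=1.0, smoothing=1.0):
    """Normalised histogram of a window, with additive smoothing so no bin is zero."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in window:
        idx = min(bins - 1, max(0, int((x - lo) / width)))
        counts[idx] += 1
    total = len(window) + smoothing * bins
    return [(c + smoothing) / total for c in counts]


def kl_divergence(p, q):
    """D_KL(P || Q) for two discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def track_concepts(stream, window_size, threshold, bins=10, lo=0.0, hi=1.0):
    """Assign each window to the nearest stored concept, or store it as a new one.

    Returns the concept pool (list of histograms) and the observed
    transitions between concepts as (from_index, to_index) pairs.
    """
    pool = []          # assumed representation: one histogram per distinct concept
    transitions = []   # raw material for a concept transformation map
    current = None
    for start in range(0, len(stream) - window_size + 1, window_size):
        hist = histogram(stream[start:start + window_size], bins, lo, hi)
        best, best_d = None, float("inf")
        for i, concept in enumerate(pool):
            d = kl_divergence(hist, concept)
            if d < best_d:
                best, best_d = i, d
        if best is None or best_d > threshold:
            # No stored concept is close enough: treat this window as a new concept.
            pool.append(hist)
            best = len(pool) - 1
        if current is not None and best != current:
            transitions.append((current, best))
        current = best
    return pool, transitions


if __name__ == "__main__":
    # Synthetic stream: concept A, then concept B, then a return to concept A.
    stream = [0.1] * 50 + [0.9] * 50 + [0.1] * 50
    pool, transitions = track_concepts(stream, window_size=25, threshold=0.5)
    print(len(pool), transitions)
```

In this sketch a recurring concept is matched to its earlier entry in the pool instead of being stored again, which is what lets the transition list describe movement between a small set of distinct concepts. Because KL-divergence is asymmetric, the choice of comparing the new window against the stored concept (and not the reverse) is itself an assumption.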
Similar resources
A Systematic Study of Online Class Imbalance Learning with Concept Drift
As an emerging research topic, online class imbalance learning often combines the challenges of both class imbalance and concept drift. It deals with data streams having very skewed class distributions, where concept drift may occur. It has recently received increased research attention; however, very little work addresses the combined problem where both class imbalance and concept drift coexis...
Detecting Concept Drift in Classification Over Streaming Graphs
Detecting concept drift in data streams has been widely studied in the data mining community. Conventional drift detection methods use classifiers’ outputs (e.g., classification accuracy, error rate) as indicators to signal concept changes. As a result, their performance greatly depends on the chosen classifiers. However, there is little work on addressing concept drift in graph-structured data...
Detecting Concept Drift in Data Stream Using Semi-Supervised Classification
A data stream is a sequence of data generated from various information sources at high speed and in high volume. Classifying data streams faces three challenges: unlimited length, online processing, and concept drift. In related research, to meet the challenge of unlimited stream length, the stream is commonly divided into fixed-size windows, or gradual forgetting is used. Concept drift refer...
Continuous time portfolio optimization
This paper presents a dynamic portfolio model based on Merton's optimal investment-consumption model, which combines a dynamic synthetic put option using risk-free and risky assets. This paper is an extended version of a methodological paper published by Yuan Yao (2012). Because of the long history of the development of foreign financial markets, with a variety of financial derivatives, the study on ...
Visualization and Concept Drift Detection Using Explanations of Incremental Models
The temporal dimension that is ever more prevalent in data makes data stream mining (incremental learning) an important field of machine learning. In addition to accurate predictions, explanations of the models and examples are a crucial component as they provide insight into model’s decision and lessen its black box nature, thus increasing the user’s trust. Proper visual representation of data...